Abstract



Current formal models for quantum computation deal only with unitary gates operating
on "pure quantum states". In these models it is difficult or impossible to deal formally
with several central issues: measurements in the middle of the computation, decoherence
and noise, the use of probabilistic subroutines, and more. It turns out that the restriction to
unitary gates and pure states is unnecessary. In this paper we generalize the formal model
of quantum circuits to a model in which the state can be a general quantum state, namely
a mixed state, or a "density matrix", and the gates can be general quantum operations,
not necessarily unitary. The new model is shown to be equivalent in computational power
to the standard one, and the problems mentioned above essentially disappear.

The main result in this paper is a solution for the subroutine problem. The general
function that a quantum circuit outputs is a probabilistic function. However, the question
of using probabilistic functions as subroutines was not previously dealt with, the reason
being that in the language of pure states this simply cannot be done. We define a natural
notion of using general subroutines, and show that using general subroutines does not
strengthen the model.

As an example of the advantages of analyzing quantum complexity using density matrices,
we prove a simple lower bound on the depth of circuits that compute probabilistic functions.
Finally, we deal with the question of inaccurate quantum computation with mixed
states. Using the so-called "trace metric" on density matrices, we show how to keep track
of errors in the new model.
1 Introduction



In the last few years theoretical computer scientists have started studying "quantum
computation". The idea, which originates with Feynman [?], may be summarized as follows: the
behavior of quantum physical systems seems to require exponential time to simulate on a
(randomized) Turing machine. Hence, possibly, we could use physical computers built by relying on
quantum physical laws to get an exponential speedup, for some computational problems, over
normal (even randomized) Turing machines. If true, this contradicts what may be called
the modern Church-Turing thesis: "randomized Turing machines can simulate, with polynomial
slowdown, any computation device".

Deutsch formalized Feynman's idea, and defined computational models of "quantum Turing
machines" and "quantum circuits" [?]. These are extensions of the classical models of
Turing machines and circuits that take into account the laws of quantum physics. Deutsch's
model, augmented with further work [14], enabled a sequence of results [?], [?], [?] culminating
with Shor's polynomial quantum algorithm for factoring integers [?]. However, it seems
that Deutsch's model is incomplete in some key aspects, which makes working formally within
it rather awkward. There is still a gap between the physical world and the formal
definitions, which often leads computer scientists to bring physical phenomena into the model
through the back door.

Let us recall the basic definitions used in current models for quantum computation. The
device operates on n quantum bits ("qubits"). Each one of the $2^n$ possible Boolean configurations
$i \in \{0,1\}^n$ of these bits denotes a basic state $|i\rangle$. The state of the computation at any point
in time is a superposition of basic states $\sum_{i \in \{0,1\}^n} c_i |i\rangle$, where the $c_i$'s are complex numbers
satisfying $\sum_i |c_i|^2 = 1$ (i.e. $\|c\| = \|c\|_2 = 1$). These superpositions are called pure states. Each
computational operation is (1) local, i.e. involves only a constant number of qubits, and (2) unitary,
i.e. maintains $\|c\| = 1$. At the end of the computation a measurement of one qubit is made,
which returns a Boolean value "true" with probability $\sum_{i : |i\rangle \in M} |c_i|^2$, where $M$ is a subspace of
$\mathbb{C}^{\{0,1\}^n}$ specified by the measurement. The state changes (in a well defined way) according to
the outcome of this measurement, and becomes a probability distribution over pure states.
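
The final-measurement rule of the standard model can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the function name `measure_qubit_probability`, the convention that qubit 0 is the leftmost bit of the index, and the two-qubit example state are all chosen here for illustration and do not come from the paper.

```python
import math

def measure_qubit_probability(amplitudes, qubit, n):
    """Probability that measuring `qubit` (0 = leftmost bit) of an n-qubit
    pure state, given by its 2**n amplitudes c_i, returns 1 ("true")."""
    norm = sum(abs(c) ** 2 for c in amplitudes)
    if not math.isclose(norm, 1.0):
        raise ValueError("amplitudes must satisfy sum |c_i|^2 = 1")
    shift = n - 1 - qubit
    # Sum |c_i|^2 over the basic states |i> whose measured bit equals 1.
    return sum(abs(c) ** 2 for i, c in enumerate(amplitudes) if (i >> shift) & 1)

# Illustrative state (|00> + |11>)/sqrt(2), amplitudes ordered |00>, |01>, |10>, |11>.
s = 1 / math.sqrt(2)
bell = [s, 0, 0, s]
p_true = measure_qubit_probability(bell, qubit=0, n=2)
```

Here the subspace M of the text is spanned by the basic states whose measured bit is 1, so the probability is the total squared amplitude on those states.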

Let us consider the model described above. During the computation, the operations must be
unitary, and the state must be a pure state. But at the end of the computation, a non-unitary
operation, a measurement, is applied, and the state becomes a probability distribution over
pure states, or what is called a mixed state. So we find that in quantum physics operations
may also be non-unitary, and states are not necessarily pure states. Restricting the model to
unitary gates and pure states seems arbitrary. It is natural to ask: what is the most general
model that captures quantum physics? In this paper we define a quantum circuit which is
allowed to be in a general quantum state, i.e. a mixed state, and which is allowed to use any
quantum operation as a gate, not necessarily unitary. Our first result is a simple corollary of
known results in physics [?]:

Theorem 1 The model of quantum circuits with mixed states is polynomially equivalent in 
computational power to the standard unitary model. 
There are several key issues which we find inconvenient, difficult, or impossible to deal with
inside the standard, unitary model. In the non-unitary model of quantum circuits with mixed
states, these problems essentially disappear.

• Measurements in the middle of computation: it has been often remarked, and 
implicitly used (e.g. Shor's algorithm), that quantum computations may allow measure- 
ments in the middle of the computation. However, the state of the computation after a 
measurement is a mixed state, which is not allowed in the standard model. This problem 
no longer exists in our model. 

• Noise and Decoherence: Noise and decoherence are key obstacles in implementing
quantum computing devices. Recent results show that, theoretically, fault tolerant
computation exists [?], [?], [?], [?]. Still, in the task of realizing quantum computers, it is likely
that more theoretical work will be needed in this direction. A key problem in this interface
between quantum physics and quantum computation models is the fact that quantum
noise, and in particular decoherence, are non-unitary operations that cause a pure state
to become a mixed state. Incorporating quantum noise, which was impossible in the
standard unitary model, is naturally done in our model.

The main technical result of this paper is a solution of one more problem in the unitary
model, namely the subroutine problem. A cornerstone of computer science is the notion of
using subroutines (or oracles, or reductions): once we are able to perform an operation A within
our model, we should be able to use A as a "black box" in further computations. The general
and natural function that a quantum computer outputs is a probabilistic function: for an input
$x$, the output is distributed according to a distribution, $D_x$, which depends on the input.
When using such a probabilistic function as a subroutine, the state is affected in a non-unitary
manner. Therefore, using subroutines in their full generality was never defined in the unitary
model. This is an incompleteness of the current model, because computable functions cannot
be used as "black boxes".

The special case of using deterministic subroutines was defined in [?], and it was shown that
using deterministic subroutines does not strengthen the model. As to probabilistic subroutines,
these were implicitly used in quantum algorithms (e.g. Shor's algorithm), but always on
a classical input, since this case can be easily understood. A conceptual difficulty lies in
applying probabilistic subroutines to quantum superpositions. It is not clear what
the natural definition should be. Here, we are able to give a natural definition which generalizes
both the case of deterministic subroutines on superpositions, and the case of probabilistic
subroutines on classical inputs. We prove that using general subroutines does not strengthen
the model. Let us define FQP to be the set of probabilistic functions computed by uniform
quantum circuits with polynomial size and depth. Our main result is:

Theorem 2 $\mathrm{FQP}^{\mathrm{FQP}} = \mathrm{FQP}$.

We hope that this new tool will be useful in quantum algorithms. 
As to the formalism, it turns out that the description of the state of the circuit by a probability
distribution over pure states is not unique. Physicists use an alternative unique description,
namely density matrices, which provides many conceptual and practical advantages. In this 
paper, we give all definitions and proofs using the density matrix picture. As an example of 
the benefits of dealing with quantum complexity questions with density matrices, we provide 
a simple lower bound on the depth of a circuit which computes probabilistic functions. The 
same lower bound seems difficult to prove when using the standard language of pure states. 

A crucial point in a computation model is understanding how inaccuracies in the basic 
computational elements affect the correctness of the computed function. In order to keep track 
of the error in the function computed by a circuit in a mixed state, one needs appropriate 
metrics on probabilistic functions, density matrices and gates. For probabilistic functions, we 
use a natural metric, relying on total variation distances. As a metric on density matrices, we 
propose to use the trace metric, induced by the trace norm $\|H\| = \sum_j |\lambda_j|$, where $\lambda_j$ are the
eigenvalues of $H$. We show that it is an appropriate metric since it quantifies the measurable
distance between two quantum states. We also define a metric on quantum gates which has very 
nice properties. An error in a function, a gate, or a density matrix will be the distance (in the 
corresponding metrics) from the correct function, gate, or density matrix, respectively. Using 
the above metrics, we establish the following fact: 

Theorem 3 Let Q be a quantum circuit which uses L probabilistic subroutines and gates, each
with at most $\epsilon$ error. The function that Q computes has at most $O(L\epsilon)$ error.
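
The trace norm behind this error accounting can be evaluated in closed form for a single qubit. The sketch below is illustrative only: the helper names `trace_norm_2x2` and `subtract` and the example states are assumptions made here, and the closed-form eigenvalues are the standard ones for a 2x2 Hermitian matrix.

```python
import math

def trace_norm_2x2(h):
    """Trace norm (sum of |eigenvalues|) of a 2x2 Hermitian matrix
    [[a, b], [conj(b), d]], using the closed-form eigenvalues."""
    a, b, d = h[0][0].real, h[0][1], h[1][1].real
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return abs(mean + radius) + abs(mean - radius)

def subtract(x, y):
    """Entrywise difference of two 2x2 matrices."""
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]

rho_zero = [[1.0, 0.0], [0.0, 0.0]]   # |0><0|
rho_one = [[0.0, 0.0], [0.0, 1.0]]    # |1><1|
rho_plus = [[0.5, 0.5], [0.5, 0.5]]   # |+><+|, with |+> = (|0> + |1>)/sqrt(2)

d_orthogonal = trace_norm_2x2(subtract(rho_zero, rho_one))
d_overlapping = trace_norm_2x2(subtract(rho_zero, rho_plus))
```

Orthogonal states are at the maximal distance 2, while overlapping states are closer, which matches the reading of the trace metric as a measurable distance.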

Organization of Paper In section 2 we provide the physical background, in a mathematical 
language. In section 3 we define our model. Section 4 provides the basic theorems regarding 
the model, and includes an example of complexity bound using density matrices. Section 5 
discusses the metrics and errors. 

2 Some Useful Physics Background 

The model of quantum computers is based on the rules of quantum mechanics. A good reference
for the basic rules is [?].

Pure states: A quantum physical system in a pure state is described by a unit vector in
a Hilbert space, i.e. a vector space with an inner product. In the Dirac notation a pure state
is denoted by $|\alpha\rangle$. The physical system which corresponds to a quantum circuit consists of n
quantum two-state particles, and the Hilbert space of such a system is $\mathcal{H}_2^n = \mathbb{C}^{\{0,1\}^n}$, i.e. a $2^n$
dimensional complex vector space. $\mathcal{H}_2^n$ is viewed as a tensor product of n Hilbert spaces of one
two-state particle: $\mathcal{H}_2^n = \mathcal{H}_2 \otimes \mathcal{H}_2 \otimes \cdots \otimes \mathcal{H}_2$. The k'th copy of $\mathcal{H}_2$ will be referred to as the k'th
qubit. We choose a special basis for $\mathcal{H}_2^n$, which is called the computational basis. It consists
of the $2^n$ orthogonal states $|i\rangle$, $0 \le i < 2^n$, where i is in binary representation. $|i\rangle$ can be seen
as a tensor product of states in $\mathcal{H}_2$: $|i\rangle = |i_1\rangle|i_2\rangle \cdots |i_n\rangle = |i_1 i_2 \ldots i_n\rangle$, where each $i_j$ gets 0 or 1.
Such a state corresponds to the j'th particle being in the state $|i_j\rangle$. A pure state $|\alpha\rangle \in \mathcal{H}_2^n$
is generally a superposition of the basis states: $|\alpha\rangle = \sum_i c_i |i\rangle$, with $\sum_i |c_i|^2 = 1$. A vector in
$\mathcal{H}_2^n$, $v_\alpha = (c_1, c_2, \ldots, c_{2^n})$, written in the computational basis representation, with $\sum_{i=1}^{2^n} |c_i|^2 = 1$,
corresponds to the pure state $|\alpha\rangle = \sum_{i=1}^{2^n} c_i |i\rangle$. $v_\alpha^\dagger$, the transposed-complex conjugate of $v_\alpha$, is
denoted $\langle\alpha|$. The inner product between $|\alpha\rangle$ and $|\beta\rangle$ is denoted $\langle\alpha|\beta\rangle = (v_\alpha^\dagger, v_\beta)$. The matrix
$v_\alpha v_\beta^\dagger$ is denoted as $|\alpha\rangle\langle\beta|$.

Mixed state: In general, a quantum system is not in a pure state. This may be attributed
to the fact that we have only partial knowledge about the system, or that the system is not
isolated from the rest of the universe, so it does not have a well defined pure state. We say
that the system is in a mixed state, and assign to the system a probability distribution,
or mixture of pure states, denoted by $\{\alpha\} = \{p_k, |\alpha_k\rangle\}$. This means that the system is with
probability $p_k$ in the pure state $|\alpha_k\rangle$. This description is not unique, as different mixtures
might represent the same physical system. As an alternative description, physicists use the
notion of density matrices, which is an equivalent description but has many advantages. A
density matrix $\rho$ on $\mathcal{H}_2^n$ is a Hermitian (i.e. $\rho = \rho^\dagger$) positive semi-definite matrix of dimension
$2^n \times 2^n$ with trace $\mathrm{Tr}(\rho) = 1$. A pure state $|\alpha\rangle = \sum_i c_i |i\rangle$ is represented by the density matrix
$\rho_{|\alpha\rangle} = |\alpha\rangle\langle\alpha|$, i.e. $\rho_{|\alpha\rangle}(i,j) = c_i c_j^*$. (By definition, $\rho(i,j) = \langle i|\rho|j\rangle$.) A mixture $\{\alpha\} = \{p_l, |\alpha_l\rangle\}$
is associated with the density matrix $\rho_{\{\alpha\}} = \sum_l p_l |\alpha_l\rangle\langle\alpha_l|$. This association is not one-to-one,
but it is onto the density matrices, because any density matrix describes the mixture of its
eigenvectors, with the probabilities being the corresponding eigenvalues. Note that diagonal
density matrices correspond to probability distributions over classical states. Density matrices
are linear operators on their Hilbert spaces. The following notations will be used: if $\mathcal{N}$ is
a finite-dimensional Hilbert space then $\mathbf{L}(\mathcal{N})$ is the set of all linear operators on $\mathcal{N}$. Also,
$\mathbf{L}(\mathcal{N}, \mathcal{M})$ stands for the set of linear operators $\mathcal{N} \to \mathcal{M}$.
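
The non-uniqueness of mixtures can be checked on a single qubit. The following sketch is an illustration and not part of the paper: the helper names `outer` and `density_matrix` and the two example mixtures are assumptions introduced here.

```python
import math

def outer(v, w):
    """Matrix |v><w| for two amplitude vectors."""
    return [[a * b.conjugate() for b in w] for a in v]

def density_matrix(mixture):
    """Density matrix sum_l p_l |a_l><a_l| of a mixture given as (p_l, a_l) pairs."""
    dim = len(mixture[0][1])
    rho = [[0j] * dim for _ in range(dim)]
    for p, vec in mixture:
        proj = outer(vec, vec)
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += p * proj[i][j]
    return rho

s = 1 / math.sqrt(2)
# Mixture A: |0> or |1>, each with probability 1/2.
mixture_a = [(0.5, [1, 0]), (0.5, [0, 1])]
# Mixture B: (|0> + |1>)/sqrt(2) or (|0> - |1>)/sqrt(2), each with probability 1/2.
mixture_b = [(0.5, [s, s]), (0.5, [s, -s])]

rho_a = density_matrix(mixture_a)
rho_b = density_matrix(mixture_b)
```

Both mixtures produce the same matrix, half the identity, so no measurement can tell them apart; this is why the density matrix is the unique description.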

A density matrix of n qubits can be reduced to a subset, A, of m < n qubits. We say that
the rest of the system, represented by the Hilbert space $\mathcal{F} = \mathbb{C}^{2^{n-m}}$, is traced out, and denote
the new matrix by $\rho|_A = \mathrm{Tr}_{\mathcal{F}}\,\rho$. It is defined as follows: $\rho|_A(i,j) = \sum_{k=1}^{2^{n-m}} \rho(ik, jk)$. Actually,
the partial trace $\mathrm{Tr}_{\mathcal{F}} : \mathbf{L}(\mathcal{N} \otimes \mathcal{F}) \to \mathbf{L}(\mathcal{N})$ is defined for any pair of finite-dimensional Hilbert
spaces $\mathcal{N}$ and $\mathcal{F}$. In words, it means averaging over $\mathcal{F}$. Any quantum operation which does
not operate on $\mathcal{F}$ commutes with this tracing out.
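
The defining formula for the reduced matrix translates directly into code. This is a minimal sketch under assumptions made here: the function name `partial_trace_last`, the convention that the traced-out factor is the second (last) tensor factor, and the example state are all illustrative.

```python
def partial_trace_last(rho, dim_keep, dim_drop):
    """Trace out the second tensor factor of a density matrix on a
    (dim_keep * dim_drop)-dimensional space: result(i, j) = sum_k rho(ik, jk)."""
    return [[sum(rho[i * dim_drop + k][j * dim_drop + k] for k in range(dim_drop))
             for j in range(dim_keep)]
            for i in range(dim_keep)]

# Density matrix of the pure state (|00> + |11>)/sqrt(2), basis order |00>, |01>, |10>, |11>.
bell = [[0.5, 0, 0, 0.5],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0.5, 0, 0, 0.5]]

reduced = partial_trace_last(bell, dim_keep=2, dim_drop=2)
```

The reduced state of the first qubit is half the identity: although the joint state is pure, the part we keep is a mixed state.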

Operations on quantum states Transformations of density matrices are linear operators
on operators (sometimes called super-operators). Any physically allowed super-operator $T :
\mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{M})$ sends density matrices to density matrices. This is equivalent to saying that $T$
is positive and trace-preserving. (A super-operator is called positive if it sends positive
semi-definite Hermitian matrices to positive semi-definite Hermitian matrices.) However, this is
not enough for a super-operator to be physically allowed. The positivity must remain if we
extend the spaces $\mathcal{N}$ and $\mathcal{M}$ by adding more qubits. That is, the super-operator $T \otimes I_{\mathcal{F}}$ must be
positive, where $I_{\mathcal{F}}$ is the identity super-operator on an arbitrary finite-dimensional Hilbert space
$\mathcal{F}$. Such $T$ is called a completely positive map. Hence physically allowed quantum operations
are linear trace-preserving completely positive maps. Clearly, linear operations on mixed states
preserve the probabilistic interpretation of the mixture, because $T \circ \rho = T \circ \left(\sum_i p_i |\alpha_i\rangle\langle\alpha_i|\right) =
\sum_i p_i\, T \circ (|\alpha_i\rangle\langle\alpha_i|)$.

One example of a super-operator is the partial trace map which we defined before, $\mathrm{Tr}_{\mathcal{F}} :
\mathbf{L}(\mathcal{N} \otimes \mathcal{F}) \to \mathbf{L}(\mathcal{N})$. Another very important example is a unitary embedding $V : \mathcal{N} \to \mathcal{M}$.
This defines the super-operator $T : \rho \mapsto V \rho V^\dagger$. A unitary embedding naturally appears when
we add a blank qubit to the system, $V_0 : |\xi\rangle \mapsto |\xi\rangle \otimes |0\rangle : \mathbb{C}^{2^n} \to \mathbb{C}^{2^{n+1}}$. It turns out that any
physically allowed super-operator is a combination of these two.

The following lemma provides the link between super-operators and standard unitary
operations. It turns out that any super-operator from n to m qubits is equivalent to the operation
of a unitary matrix on $2n + m$ qubits.

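lemma 1 (modification of Choi (1970), Hellwig and Kraus (1975), and Schumacher (1996)):
The following conditions are equivalent:

1. A super-operator $T : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{M})$ is trace-preserving and completely positive.

2. There is a Hilbert space $\mathcal{F}$ with $\dim(\mathcal{F}) \le \dim(\mathcal{N}) \dim(\mathcal{M})$, and a unitary embedding
$V : \mathcal{N} \to \mathcal{M} \otimes \mathcal{F}$ such that $T\rho = \mathrm{Tr}_{\mathcal{F}}(V \rho V^\dagger)$ for all $\rho \in \mathbf{L}(\mathcal{N})$.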

A super-operator corresponding to a unitary transformation on a space $\mathcal{N}$, $|\alpha\rangle \mapsto U|\alpha\rangle$,
sends a quantum state $\rho = |\alpha\rangle\langle\alpha|$ to the state $U \rho U^\dagger$. Such a super-operator is denoted by
$U \cdot U^\dagger$. This is one important example of a physically realizable super-operator, and corresponds to the
standard unitary operations.

Super-operators can be extended to operate on larger spaces by taking the tensor product with
the identity operator: $T : \mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{M})$ will be extended to $T \otimes I : \mathbf{L}(\mathcal{N} \otimes \mathcal{R}) \to \mathbf{L}(\mathcal{M} \otimes \mathcal{R})$.
Usually, we will be interested in those super-operators that are extensions of super-operators
on spaces with small dimensionality. This will correspond to local gates later on.

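In order to describe the operation of super-operators on density matrices $\rho \in \mathbf{L}(\mathcal{N})$ it suffices,
by linearity, to specify what happens to a basis which spans the density matrices: $|i\rangle\langle j|$,
where $i, j$ run over all basis vectors of $\mathcal{N}$. (Any density matrix can be written as $\rho =
\sum_{i,j} \rho_{ij} |i\rangle\langle j|$.) For example, the unitary operation $|i\rangle \mapsto |v_i\rangle$ corresponds to the super-operator
specified by $|i\rangle\langle j| \mapsto |v_i\rangle\langle v_j|$. If $U$ is extended to act on $\mathbf{L}(\mathcal{N} \otimes \mathcal{M})$, then $|i,k\rangle\langle j,l| = |i\rangle\langle j| \otimes |k\rangle\langle l| \mapsto
(|v_i\rangle\langle v_j|) \otimes (|k\rangle\langle l|)$.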

Measurements A quantum system can be measured, or observed. Let us consider a set of
positive semi-definite Hermitian operators $\{P_m\}$, such that $\sum_m P_m = I$. The measurement is a
process which yields a probabilistic classical output. For a given density matrix $\rho$, the output
is $m$ with probability $\Pr(m) = \mathrm{Tr}(P_m \rho)$.

We will use only projection measurements. Namely, we assume that the $P_m$ are orthogonal
projections onto mutually orthogonal subspaces $S_m$ which span the whole space $\mathcal{N} = \mathbb{C}^{2^n}$, i.e.
$\mathcal{N} = \bigoplus_m S_m$. A more particular type of measurement which we will be using a lot is a basic
measurement of r qubits. In this case, $P_m$ (with $1 \le m \le 2^r$) are the projections on the subspaces
$S_m$, where $S_m$ is the subspace spanned by basic vectors on which the measured qubits have the
values corresponding to the string $m$: $S_m = \mathrm{span}\{|m,j\rangle,\ j = 1, \ldots, 2^{n-r}\}$. This process
corresponds to measuring the value of r qubits in the basic basis, where here, for simplicity,
we considered measuring the first r qubits.

The classical result of a measurement, $m$, can be represented by the density matrix $|m\rangle\langle m|$
in an appropriate Hilbert space $\mathcal{M}$. The state of the quantum system after a projection
measurement is also defined; it is equal to $\Pr(m)^{-1} P_m \rho P_m$. (It has the same meaning as a
conditional probability distribution.) Thus, the projection measurement can be described by a
super-operator $T$ which maps quantum states on the space $\mathcal{N}$ to quantum states on the space
$\mathcal{N} \otimes \mathcal{M}$, the result being diagonal with respect to the second variable:

$$T\rho = \sum_m (P_m \rho P_m) \otimes (|m\rangle\langle m|)$$
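
The measurement super-operator can be written out for small matrices. The sketch below is an illustration under assumptions made here: the function names, the index convention (system index times the number of outcomes, plus the outcome m), and the one-qubit example are not taken from the paper.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def measurement_superoperator(rho, projectors):
    """Return (T rho, probabilities), where T rho = sum_m (P_m rho P_m) (x) |m><m|
    and probabilities[m] = Tr(P_m rho)."""
    dim, outcomes = len(rho), len(projectors)
    result = [[0.0] * (dim * outcomes) for _ in range(dim * outcomes)]
    probs = []
    for m, p in enumerate(projectors):
        block = matmul(matmul(p, rho), p)          # unnormalized post-measurement state
        probs.append(trace(matmul(p, rho)))        # Pr(m) = Tr(P_m rho)
        for i in range(dim):
            for j in range(dim):
                # The classical register is diagonal: only |m><m| entries are filled.
                result[i * outcomes + m][j * outcomes + m] = block[i][j]
    return result, probs

# Basic measurement of one qubit in the state |+><+|.
rho_plus = [[0.5, 0.5], [0.5, 0.5]]
projectors = [[[1, 0], [0, 0]], [[0, 0], [0, 1]]]
t_rho, probs = measurement_superoperator(rho_plus, projectors)
```

In this example the output is the diagonal matrix with entries 1/2 at the positions of |0,0> and |1,1>: the qubit and the classical record agree, and all coherence between outcomes is gone.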

3 Quantum Circuits with Mixed States 

We define here a model of quantum circuits, using density matrices. This enables us to apply 
general non-unitary gates. The circuit is defined to compute probabilistic functions, which are 
a generalization of Boolean functions computed by a standard quantum circuit. A quantum 
gate is defined to be the most general quantum operation: 

definition 1 A quantum gate, $g$, of order $(k, l)$ is a trace-preserving, completely positive,
linear map from density matrices on k qubits to density matrices on l qubits. Its action on the
density matrices is denoted as follows: $\rho \mapsto g \circ \rho$. (The "$\circ$" sign is used for clarity and could
be omitted.)

The unitary gate, $U$, of the standard model is a special case of a quantum gate. The
corresponding super-operator is $U \cdot U^\dagger$. Using our "$\circ$" notation, we can denote it simply by $U$
with no danger of confusion. Thus, $U \circ \rho = U \rho U^\dagger$.

A measurement is also a special case of a quantum gate: the probabilistic projection onto
a set of mutually orthogonal subspaces. Besides changing the state of the qubits, it produces a
classical probabilistic result. (As shown above, both results can be described by a joint density
matrix, but it is better to take advantage of the second result being classical.)

We now define a quantum circuit: 

definition 2 Let $\mathcal{G}$ be a family of quantum gates. A quantum circuit that uses gates
from $\mathcal{G}$ is a directed acyclic graph. Each node $v$ in the graph is labeled by a gate $g_v \in \mathcal{G}$ of
order $(k_v, l_v)$. The in-degree and out-degree of $v$ are equal to $k_v$ and $l_v$, respectively. An arbitrary
subset of the inputs are labeled "blank". An arbitrary subset of the outputs are labeled "result".
Here is a schematic example of such a circuit. We use "b", "r" for "blank", "result".

[Figure: schematic quantum circuit; the surviving labels show gates $g_2$ and $g_3$ and inputs marked "b".]


The circuit operates on a density matrix as follows: 

definition 3 Final density matrix: Let Q be a quantum circuit. Choose a topological sort
for Q: $g_t, \ldots, g_1$, where the $g_j$ are the gates used in the circuit. The final density matrix for an initial
density matrix $\rho$ is $Q \circ \rho = g_t \circ \cdots \circ g_2 \circ g_1 \circ \rho$.

Usually there exists more than one topological sort for a given circuit Q. Yet $Q \circ \rho$ is well
defined, because the operations of the gates in a quantum circuit determined by two different topological
sorts are equivalent:

lemma 2 $g_t \circ \cdots \circ g_2 \circ g_1 \circ \rho = g_{\sigma(t)} \circ \cdots \circ g_{\sigma(2)} \circ g_{\sigma(1)} \circ \rho$, where $\sigma$ is a permutation, if the two
orderings are two topological sorts of the same quantum circuit.

The reason for this is that two gates that operate on different qubits commute. The proof
of lemma 2 easily follows from:

lemma 3 Let $g_1, g_2$ be two quantum gates operating on different qubits. Then for every $\rho$,
$g_1 \circ g_2 \circ \rho = g_2 \circ g_1 \circ \rho$.

Proof: To extend the gates to operate on the whole set of n qubits, we tensor with the
identity. For simplicity, let us assume that $g_1$ operates on the first $k_1$ qubits (the Hilbert space
$\mathcal{N}$), and $g_2$ operates on some of the rest of the qubits (the Hilbert space $\mathcal{M}$). We only need
to show that $(g_1 \otimes I_{\mathcal{M}}) \circ (I_{\mathcal{N}} \otimes g_2) = g_1 \otimes g_2$ and $(I_{\mathcal{N}} \otimes g_2) \circ (g_1 \otimes I_{\mathcal{M}}) = g_1 \otimes g_2$. These are
simply particular cases of the identity

$$(X_1 \otimes X_2)(Y_1 \otimes Y_2) = (X_1 Y_1) \otimes (X_2 Y_2)$$

where $X_1, Y_1$ and $X_2, Y_2$ act on two arbitrary linear spaces. (The "$\circ$" signs are omitted here.) ∎

Now we are ready to define the function that the circuit computes. The circuit produces
a probability distribution over strings of r bits, which depends on its input string. The
probability distribution is computed out of the final density matrix, and is the same probability
distribution as one would get over the strings of outcomes if, at the end of the computation, we
applied a basic measurement of all the "result" qubits.

definition 4 Computed function: Let Q be a quantum circuit, with n inputs and r "result"
outputs. The probabilistic function that Q computes, $f_Q = f : \{0,1\}^n \to [0,1]^{\{0,1\}^r}$, is defined
as follows: For input $i$, the probability for output $j$ is $f_{i,j} = \langle j|\,(Q \circ |i\rangle\langle i|)|_A\,|j\rangle$, where $A$ is
the set of the "result" outputs.



4 Results 

In this section we provide the theorems which prove that the model is equivalent to the standard 
model and that it allows using probabilistic subroutines. We also provide a simple lower bound 
on the depth of circuits computing probabilistic functions. 

4.1 Substituting General Gates by Unitary Gates 

General non-unitary gates can be replaced by unitary gates, as is shown by the following lemma: 

lemma 4 Let $g$ be a quantum gate of order $(n, m)$. There exists a unitary quantum gate $U_g$
on $2n + m$ qubits which satisfies: For any $\rho$, $g \circ \rho = \left(U_g \circ (\rho \otimes |0^{n+m}\rangle\langle 0^{n+m}|)\right)|_A$, where $A$
is the set of the first m qubits.

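Proof: This lemma follows from lemma 1. By definition 1, the gate $g$ is a trace-preserving
completely positive super-operator $\mathbf{L}(\mathcal{N}) \to \mathbf{L}(\mathcal{M})$, where $\mathcal{N} = \mathbb{C}^{2^n}$ and $\mathcal{M} = \mathbb{C}^{2^m}$. By
lemma 1 it has a representation of the form $g = \mathrm{Tr}_{\mathcal{F}}(V \cdot V^\dagger)$, where $\mathcal{F} = \mathbb{C}^{2^{n+m}}$ and $V : \mathcal{N} \to
\mathcal{M} \otimes \mathcal{F}$ is a unitary embedding. Let us choose an orthonormal basis $\{|\eta_{i,j}\rangle,\ 1 \le i \le 2^n,\ 1 \le j \le
2^{n+m}\}$ in $\mathcal{M} \otimes \mathcal{F}$, such that $|\eta_{i,0^{n+m}}\rangle = V|i\rangle$ for any $i$ (the other basis vectors are arbitrary).
There is a unique unitary operator $U$ which sends each vector $|i,j\rangle$ of the basic basis to the
vector $|\eta_{i,j}\rangle$ of the new basis. It is obvious that $V = U V_0$, where $V_0 : |\xi\rangle \mapsto |\xi\rangle \otimes |0^{n+m}\rangle$. Hence
$g = \mathrm{Tr}_{\mathcal{F}}(U V_0 \cdot V_0^\dagger U^\dagger)$. This is what we need. ∎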

It follows that a circuit with general gates can be simulated by a circuit with unitary gates 
efficiently: 

Theorem 1: The model of quantum circuits with mixed states is polynomially equivalent
in computational power to the standard model.
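
A small instance of this replacement can be checked directly. The sketch below is an assumption-laden illustration rather than the construction in the proof of lemma 4: it takes the non-unitary one-qubit gate that erases off-diagonal entries (a basic measurement whose outcome is forgotten) and realizes it with a controlled-not onto one blank qubit followed by tracing that qubit out. All function names are chosen here.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def kron(a, b):
    """Tensor product a (x) b, with the index of `a` as the more significant one."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def partial_trace_last(rho, dim_keep, dim_drop):
    return [[sum(rho[i * dim_drop + k][j * dim_drop + k] for k in range(dim_drop))
             for j in range(dim_keep)] for i in range(dim_keep)]

# Controlled-not with the first qubit as control: |00>->|00>, |10>->|11>.
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
BLANK = [[1, 0], [0, 0]]  # density matrix |0><0| of a blank qubit

def dephase_via_unitary(rho):
    """Apply the unitary gate to rho (x) |0><0| and discard the extra qubit."""
    joint = kron(rho, BLANK)
    evolved = matmul(matmul(CNOT, joint), dagger(CNOT))
    return partial_trace_last(evolved, 2, 2)

rho = [[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]]
out = dephase_via_unitary(rho)
```

The output keeps the diagonal of `rho` and has zero off-diagonal entries, which is exactly the action of the non-unitary gate; only unitary operations, a blank qubit, and a partial trace were used.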



4.2 Using General Subroutines 

A (probabilistic) subroutine is a function, $f : \{0,1\}^m \to \mathbb{R}^{\{0,1\}^p}$, which outputs $j$ with a
distribution which depends on the input $i$. A quantum circuit that uses subroutines is a circuit
in which a node of fan-in = fan-out = $m + p$ may be associated, instead of with a quantum gate, with a
probabilistic function $f : \{0,1\}^m \to \mathbb{R}^{\{0,1\}^p}$. Our definition of the way this "subroutine gate",
denoted by $g_f$, affects the density matrix is by operating all possible deterministic functions,
in the standard way, where each deterministic function is applied with the induced probability
from the probabilistic function:

definition 5 Operation of a subroutine gate:

$$g_f \circ \rho = \sum_d \Pr(d)\, U_d\, \rho\, U_d^\dagger$$

where the sum is over all deterministic functions $d : \{0,1\}^m \to \{0,1\}^p$, $U_d$ is defined by
$U_d |i, 0\rangle = |i, d(i)\rangle$, and the induced probability for $d$ is:

$$\Pr(d) = \prod_i \mathrm{Prob}(i \mapsto d(i)) = \prod_i f_{i,d(i)}$$

Note that, as discussed in the introduction, this definition generalizes both the case of
deterministic subroutines on superpositions, and the case of probabilistic subroutines on classical
inputs. The sum contains $2^{p 2^m}$ summands, one for each deterministic function. It turns out that
the same operation on density matrices can be written in a much more compact form.



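lemma 5

$$g_f \circ (|i_1, 0\rangle\langle i_2, 0|) =
\begin{cases}
\sum_j f_{i,j}\, |i,j\rangle\langle i,j| & \text{if } i_1 = i_2 = i \\
\sum_{j_1,j_2} f_{i_1,j_1} f_{i_2,j_2}\, |i_1,j_1\rangle\langle i_2,j_2| & \text{if } i_1 \ne i_2
\end{cases} \qquad (1)$$

Proof:

$$\sum_d \Pr(d)\, U_d |i_1,0\rangle\langle i_2,0| U_d^\dagger
= \sum_d \Pr(d)\, |i_1,d(i_1)\rangle\langle i_2,d(i_2)|
= \sum_{j_1,j_2} \Big( \sum_{d:\, d(i_1)=j_1,\, d(i_2)=j_2} \Pr(d) \Big)\, |i_1,j_1\rangle\langle i_2,j_2| .$$

We now compute the term in the brackets. If $i_1 = i_2$, it becomes:

$$\delta(j_1,j_2) \sum_{d:\, d(i_1)=j_1} \prod_i \mathrm{Prob}(i \mapsto d(i))
= \delta(j_1,j_2)\, f_{i_1,j_1} \sum_{\{j_i\},\, i \ne i_1}\ \prod_{i \ne i_1} f_{i,j_i}
= \delta(j_1,j_2)\, f_{i_1,j_1} .$$

If $i_1 \ne i_2$, then the same computation yields

$$\sum_{d:\, d(i_1)=j_1,\, d(i_2)=j_2} \prod_i \mathrm{Prob}(i \mapsto d(i))
= f_{i_1,j_1} f_{i_2,j_2} \sum_{\{j_i\},\, i \ne i_1,i_2}\ \prod_{i \ne i_1,i_2} f_{i,j_i}
= f_{i_1,j_1} f_{i_2,j_2} .$$ ∎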
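
Lemma 5 can be checked by brute force on a small probabilistic function. The sketch below is an illustration under assumptions made here: inputs and outputs are indexed by integers rather than bit strings, the function names are hypothetical, and the table `f` is an arbitrary example whose rows sum to 1.

```python
from itertools import product

def gate_by_definition(f, i1, i2):
    """Coefficients of |i1,j1><i2,j2| in g_f o |i1,0><i2,0|, obtained by summing
    Pr(d) over every deterministic function d (definition 5)."""
    n_inputs, n_outputs = len(f), len(f[0])
    coeffs = {}
    for d in product(range(n_outputs), repeat=n_inputs):  # d[i] is the value d(i)
        pr = 1.0
        for i in range(n_inputs):
            pr *= f[i][d[i]]                               # Pr(d) = prod_i f[i][d(i)]
        key = (d[i1], d[i2])
        coeffs[key] = coeffs.get(key, 0.0) + pr
    return coeffs

def gate_compact(f, i1, i2):
    """The same coefficients according to the compact form of lemma 5."""
    n_outputs = len(f[0])
    if i1 == i2:
        return {(j, j): f[i1][j] for j in range(n_outputs)}
    return {(j1, j2): f[i1][j1] * f[i2][j2]
            for j1 in range(n_outputs) for j2 in range(n_outputs)}

# Example table f[i][j] = probability of output j on input i (m = 2, p = 1).
f = [[0.3, 0.7], [0.6, 0.4], [1.0, 0.0], [0.25, 0.75]]

agree = all(
    set(gate_by_definition(f, a, b)) == set(gate_compact(f, a, b))
    and all(abs(gate_by_definition(f, a, b)[k] - gate_compact(f, a, b)[k]) < 1e-12
            for k in gate_compact(f, a, b))
    for a in range(len(f)) for b in range(len(f))
)
```

The brute-force sum ranges over all 16 deterministic functions here, while the compact form needs at most four products, which is the practical content of the lemma.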
Before we state our main theorem, here are some notations: For a family of gates $\mathcal{G}$ we
denote by $U_{\mathcal{G}}$ the set of unitary gates corresponding to gates from $\mathcal{G}$, according to lemma 4. We
denote by $U_{\mathcal{G}}^\dagger$ the set of daggered unitary gates corresponding to gates from $\mathcal{G}$. $C$ is a special
unitary gate on two qubits, the controlled-not gate. It satisfies $C|00\rangle = |00\rangle$, $C|10\rangle = |11\rangle$,
and thus serves as a copying gate.

Theorem 2: $\mathrm{FQP}^{\mathrm{FQP}} = \mathrm{FQP}$: Let $\mathcal{G}$ be a set of gates, and $S$ a set of probabilistic functions
from m to r bits, computable by quantum circuits using no more than k gates from $\mathcal{G}$. Let Q be
a quantum circuit which uses n gates from $\mathcal{G}$ and l subroutines from $S$. Then there exists a quantum
circuit $\tilde{Q}$ which uses no more than $n + l\,(O(k) + O(r) + O(m))$ gates from $U_{\mathcal{G}} \cup U_{\mathcal{G}}^\dagger \cup C$ and
computes $f_Q$.

Proof: We now show that a subroutine $s \in S$, for $s : \{0,1\}^m \to \mathbb{R}^{\{0,1\}^r}$, can be replaced
by $O(k) + O(r) + O(m)$ gates from $U_{\mathcal{G}} \cup U_{\mathcal{G}}^\dagger \cup C$. The idea is to apply $Q_s$, read the result by
copying it to r extra qubits, and undo the subroutine. Up to this point, this just follows the line
of the proof for deterministic subroutines [?]. However, this is not enough when dealing with
probabilistic functions. The reason, intuitively, is that in probabilistic subroutines there is more
than one possible output for one input, so the state of the bits that are used to copy the output
is not in tensor product with that of the input and output bits, even if the input was classical.
Hence, undoing the subroutine does not take the input bits back to their original input state.
The reader is urged to try and see for herself why more effort is needed. We proceed by the
following operations: We add $m + 1$ blank qubits to the circuit. The last bit will be a garbage
control bit. First, we check if there is garbage left, i.e. if the string written on the qubits other
than the main qubits is different from zero, and if so, we change the garbage control bit to one.
Then, conditioned on the garbage control bit being one, we copy the m-bit input string of the
subroutine to m ancilla bits. If there is no garbage, we leave the ancilla bits blank. Then
we trace out, or discard, the garbage and the $m + 1$ ancilla bits. This procedure results in
the same operation as the subroutine gate, $g_s$, and we have used $O(k) + O(r) + O(m)$ gates.

Let us agree on some notation before continuing: Let the subroutine $s$ be computed by the
circuit $Q_s$, which uses only unitary gates from $U_{\mathcal{G}}$, using theorem 1. Thus the operation of $Q_s$
is unitary, and is described by the unitary matrix $U_s = U$. So the final density matrix of $Q_s$ is
a density matrix of a pure state, which can be written as $U|i, 0\rangle$ for an input $i$ and r blank
qubits. We can write:

$$U|i, x\rangle = |i\rangle \otimes U_i |x\rangle$$

$$U_i |0\rangle = \sum_j |j, \psi_{ij}\rangle \quad \text{where} \quad \langle\psi_{ij}|\psi_{ij}\rangle = f_{ij}$$

Let us track the procedure step by step. We will do that by seeing what happens to a
matrix of the form $|i_1\rangle\langle i_2|$. By linearity, this will be enough.
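When $j$ is copied, the above expression becomes

$$|i\rangle \mapsto \sum_j |i, j\rangle \otimes |j, \psi_{ij}\rangle .$$

The first two registers will be referred to as the qubits, and the last two registers will be
discarded later. Then $U^{-1} = U^\dagger$ is applied, which yields

$$|\mu_i\rangle = \sum_j |i, j\rangle \otimes U_i^\dagger |j, \psi_{ij}\rangle .$$

Let us represent $|\mu_i\rangle$ as $|\eta_i\rangle + |\nu_i\rangle$, where

$$|\eta_i\rangle = \sum_j \langle 0|U_i^\dagger|j, \psi_{ij}\rangle\, |i, j\rangle \otimes |0\rangle$$

corresponds to the possibility of having no garbage, and $|\nu_i\rangle = |\mu_i\rangle - |\eta_i\rangle$ is orthogonal to
$|\eta_i\rangle$. Note that $\langle 0|U_i^\dagger|j, \psi_{ij}\rangle = \langle\psi_{ij}|\psi_{ij}\rangle = f_{ij}$, so

$$|\eta_i\rangle = \sum_j f_{ij}\, |i, j\rangle \otimes |0\rangle .$$

We now add the step of computing the state of the garbage control qubit and, conditioned on
that, copying the input. The overall procedure can be represented as follows:

$$|i\rangle \mapsto |\eta_i, \mathrm{no}, 0\rangle + |\nu_i, \mathrm{yes}, i\rangle$$

where no and yes are states of the garbage control bit. As long as the garbage and the
ancilla bits are discarded, i.e. the reduced density matrix on the original set of qubits (denote
it by $Q$) is taken, we have:

$$|i_1\rangle\langle i_2| \mapsto \big(|\eta_{i_1}, \mathrm{no}, 0\rangle + |\nu_{i_1}, \mathrm{yes}, i_1\rangle\big)\big(\langle\eta_{i_2}, \mathrm{no}, 0| + \langle\nu_{i_2}, \mathrm{yes}, i_2|\big)\big|_Q
= (|\eta_{i_1}\rangle\langle\eta_{i_2}|)|_Q + (|\nu_{i_1}\rangle\langle\nu_{i_2}|)|_Q\, \delta_{i_1,i_2} .$$

For $i_1 \ne i_2$, we have:

$$(|\eta_{i_1}\rangle\langle\eta_{i_2}|)|_Q = \sum_{j_1,j_2} f_{i_1,j_1} f_{i_2,j_2}\, |i_1,j_1\rangle\langle i_2,j_2| .$$

For $i_1 = i_2 = i$, we have to go a few steps back in our calculations. Recall that the vectors $|\eta_i\rangle$ and
$|\nu_i\rangle$ were defined in such a way that $(|\eta_i\rangle\langle\nu_i|)|_Q = 0$ (because $|\eta_i\rangle$ corresponds to no garbage
whereas $|\nu_i\rangle$ corresponds to non-null garbage). Hence

$$(|\eta_i\rangle\langle\eta_i| + |\nu_i\rangle\langle\nu_i|)|_Q = (|\mu_i\rangle\langle\mu_i|)|_Q
= \sum_{j,j'} \langle j', \psi_{ij'}|U_i U_i^\dagger|j, \psi_{ij}\rangle\, |i,j\rangle\langle i,j'|
= \sum_j f_{ij}\, |i,j\rangle\langle i,j| .$$

Thus we have the desired transformation. ∎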
4.3 Simple Lower Bounds on Probabilistic Functions

We prove a lower bound on probabilistic functions. The proof relies on causality, which can 
be stated as follows. Consider a quantum circuit Q, and two qubits a and b. The two bits are 
correlated only if there is a gate from which there is a path to them both. This will imply a 
lower bound on probabilistic functions where one qubit is correlated to many others. 

lemma 6 Causality lemma: Let Q be a quantum circuit, with gates $g_t, \ldots, g_1$. Let $\rho$ be a
density matrix of a basic state. If $(Q \circ \rho)|_{a,b}$ is not a tensor product, there exists $i$ such that there
are two (directed) paths in the circuit: $g_i \to a_f$ and $g_i \to b_f$.

Proof: Let us assume that there is no $i$ such that there are two (directed) paths in the 
circuit: $g_i \longmapsto a_f$ and $g_i \longmapsto b_f$. Let us now find a topological sort of all gates from which 
there is a directed path to $a_f$ (and therefore not to $b_f$), and let us call this set of gates $G_a$. Let 
us sort the set $G_b$ similarly, and the rest of the gates $G_c$ also. We claim that the sort $G_c G_b G_a$ 
is a topological sort of the circuit. To show this, we need only show that if there is a path from 
$g_i$ to $g_j$ in the circuit, then $g_j$ appears to the left of $g_i$. The only thing we have to check is that 
there is no path from a gate $g_c$ in $G_c$ to any gate in $G_a$ ($G_b$). But if there was such a path, then 
$g_c$ would have belonged to $G_a$ ($G_b$). Now, $(G_c \circ G_b \circ G_a \circ \rho)|_{a,b} = (G_b \circ G_a \circ \rho)|_{a,b}$ due to the 
following lemma: 

lemma 7 Let $g$ be a gate operating on qubits not in the set $B$. Then $\rho|_B = (g \circ \rho)|_B$. 

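proof: Let $B$ be described by the first indices, and let $g$ operate on the space described by the second 
indices. 

$$\rho = \sum_{i,k,j,l} \rho_{ik,jl} \, |i,k\rangle\langle j,l|, \qquad \rho|_B = \sum_{i,j} \Big(\sum_k \rho_{ik,jk}\Big) |i\rangle\langle j|$$

To apply the gate $g$, we use the equivalent unitary gate $U_g$ given by the corresponding lemma: 

$$g \circ \rho = \sum_{i,k,j,l} \rho_{ik,jl} \, |i\rangle\langle j| \otimes U_g |k\rangle\langle l| U_g^\dagger = \sum_{i,j,k,l,k',l'} \rho_{ik,jl} \, |i\rangle\langle j| \otimes |k'\rangle\langle k'| U_g |k\rangle\langle l| U_g^\dagger |l'\rangle\langle l'|.$$

Computing the reduced density matrix we get: 

$$(g \circ \rho)|_B = \sum_{i,j} |i\rangle\langle j| \Big(\sum_{k,l,k'} \rho_{ik,jl} \, \langle k'| U_g |k\rangle\langle l| U_g^\dagger |k'\rangle\Big)$$

but $\sum_{k'} \langle k'| U_g |k\rangle\langle l| U_g^\dagger |k'\rangle = \langle l| U_g^\dagger U_g |k\rangle = \delta_{k,l}$, so the inner sum reduces to $\sum_k \rho_{ik,jk}$, which is exactly the coefficient of $\rho|_B$. ∎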
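Lemma 7 can be checked numerically on a small case. The following is a minimal sketch, assuming a two-qubit system in which $B$ is the first qubit and the gate is a Hadamard on the second qubit. The helper names (`kron_identity_left`, `reduce_to_first`, `max_abs_diff`) are illustrative and do not come from the paper, and a unitary gate stands in here for a general one.

```python
from math import sqrt

def matmul(p, q):
    """Product of two square matrices given as nested lists."""
    n = len(p)
    return [[sum(p[r][k] * q[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def dagger(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[complex(m[c][r]).conjugate() for c in range(n)] for r in range(n)]

def kron_identity_left(u):
    """I (x) u for a 2x2 gate u acting on the second qubit (index = 2*first + second)."""
    big = [[0.0] * 4 for _ in range(4)]
    for i in range(2):
        for k in range(2):
            for l in range(2):
                big[2 * i + k][2 * i + l] = u[k][l]
    return big

def reduce_to_first(rho):
    """Partial trace over the second qubit, i.e. rho restricted to the first qubit."""
    return [[sum(rho[2 * i + k][2 * j + k] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_abs_diff(p, q):
    """Largest entrywise difference between two matrices of equal shape."""
    n = len(p)
    return max(abs(p[r][c] - q[r][c]) for r in range(n) for c in range(n))

# Illustrative input: the Bell state (|00> + |11>)/sqrt(2) as a density matrix.
h = 1 / sqrt(2)
bell = [h, 0.0, 0.0, h]
rho = [[x * y for y in bell] for x in bell]

# Apply a Hadamard to the second qubit only.
u = kron_identity_left([[h, h], [h, -h]])
after = matmul(matmul(u, rho), dagger(u))

before_reduced = reduce_to_first(rho)
after_reduced = reduce_to_first(after)
print(max_abs_diff(before_reduced, after_reduced))  # numerically zero
print(max_abs_diff(rho, after))                     # the global state did change
```

The global state changes, while the reduced density matrix of the untouched qubit does not, which is the content of the lemma in this special case.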

The sets of qubits $A$, $B$ which $G_a$ and $G_b$ operate upon are disjoint, according to our 
assumption. Let $A'$, $B'$ be disjoint sets of qubits such that their union is all the qubits, and $A' \supseteq A$, 
$B' \supseteq B$. Since a basic state is a product state, we can write: 

$$(G_b \circ G_a \circ \rho)|_{a,b} = (G_b \circ G_a \circ (\rho|_{A'} \otimes \rho|_{B'}))|_{a,b} = ((G_a \circ \rho|_{A'}) \otimes (G_b \circ \rho|_{B'}))|_{a,b} = (G_a \circ \rho|_{A'})|_a \otimes (G_b \circ \rho|_{B'})|_b.$$

This shows that the final reduced density matrix is a tensor product. ∎
Let us define the correlation graph for a state: 

definition 6 Correlation graph: Given a state $\rho$ of $n$ qubits, we define the correlation graph 
$G_\rho(V,E)$ of the state as follows. The set of nodes $V$ will consist of $n$ nodes, corresponding to 
the $n$ qubits. An edge $(a,b) \in E$ iff the reduced density matrix $\rho|_{a,b}$ is not a tensor product. 

We now claim that the depth of the circuit with final density matrix $\rho$ is bounded from below, 
up to a constant factor, by the logarithm of the maximal degree in the correlation graph $G_\rho$. 

lemma 8 Let $Q$ be a quantum circuit, with all gates of fan-in $\le k$. Let the maximal degree 
of the correlation graph of $Q \circ |i\rangle$ be $c$, for some input $i$. Then the depth of $Q$ satisfies 
$D(Q) \ge \frac{1}{2}\log_k(c)$. 

Proof: By causality, if there is an edge in the correlation graph between qubits $a$, $b$ then in 
the circuit there is a node $g_i$ such that there are two (directed) paths in the circuit: $g_i \longmapsto a_f$ 
and $g_i \longmapsto b_f$. For a circuit of depth $D$, and a given qubit $a$, the maximal number of qubits 
which are connected to $a$ in such a way is $k^{2D}$. So $c \le k^{2D}$, and hence $D(Q) \ge \frac{1}{2}\log_k(c)$. ∎

The correlation graph can be defined for probabilistic functions as well. If the output is a 
probabilistic string of $r$ bits, it will be a graph of $r$ nodes. Edges will connect pairwise correlated 
bits. 

lemma 9 Correlation bound: Let $Q$ be a quantum circuit computing $f$, a probabilistic function. 
Let $c$ be the maximal degree of the correlation graph of $f$. Then $D(Q) \ge \log_k(c)$. 

As a trivial example, consider the probabilistic function that outputs (for any input) with 
probability $\frac{1}{2}$ the string $0^r$ and with probability $\frac{1}{2}$ the string $1^r$. The lemma shows that a circuit 
that computes this function must be of depth larger than $\log(r)$. 
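The correlation graph of a probabilistic output can be computed directly from its distribution. The following is a minimal sketch, assuming the distribution is given as a dictionary from bit strings to probabilities. The function names and the tolerance `tol` are illustrative choices, not part of the paper. Two bits are joined by an edge when their joint marginal differs from the product of their single-bit marginals.

```python
from itertools import combinations

def marginal(dist, positions):
    """Marginal distribution of the bits at the given positions."""
    out = {}
    for s, p in dist.items():
        key = tuple(s[q] for q in positions)
        out[key] = out.get(key, 0.0) + p
    return out

def correlated(dist, a, b, tol=1e-12):
    """True if bits a and b are not independent under dist."""
    pa, pb = marginal(dist, [a]), marginal(dist, [b])
    pab = marginal(dist, [a, b])
    for (x,), px in pa.items():
        for (y,), py in pb.items():
            if abs(pab.get((x, y), 0.0) - px * py) > tol:
                return True
    return False

def max_degree(dist, r):
    """Maximal degree of the correlation graph on r output bits."""
    degree = [0] * r
    for a, b in combinations(range(r), 2):
        if correlated(dist, a, b):
            degree[a] += 1
            degree[b] += 1
    return max(degree)

# The example from the text: 0^r or 1^r, each with probability 1/2.
r = 8
all_or_nothing = {"0" * r: 0.5, "1" * r: 0.5}
c = max_degree(all_or_nothing, r)
print(c)  # 7: every bit is correlated with each of the other r - 1 bits

# For contrast, two independent uniform bits give an empty graph.
independent = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}
print(max_degree(independent, 2))  # 0
```

The maximal degree `c` computed this way is the quantity that enters the depth bounds of lemmas 8 and 9.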

5 Precision and Errors 

In the theory of quantum computation (as in real life), operators, quantum states, etc. are 
defined with some precision. Thus, we need to define certain metrics on the corresponding 
spaces. We will find a natural metric (more specifically, a norm) for each class of objects 
we deal with: pure and mixed states, unitary and arbitrary gates. After proving some basic 
properties of these norms, we will show, in a very general form, that error accumulation in 
quantum computation is at most additive (see Theorem 4 below). 

5.1 The Natural Distance Between Probabilistic Functions 



We need a measure for the accuracy of the function computed. The natural norm to use is the 
$\ell_1$-norm, called the total variation distance (t.v.d.) between probability distributions. We use 
t.v.d. to define a metric on probabilistic functions. 

definition 7 Let $f, g$ be probabilistic functions. For input $i$, $f_i, g_i$ are probability distributions. 
The total variation distance between $f_i, g_i$ is $|f_i - g_i| = \sum_j |f_{i,j} - g_{i,j}|$, and $\|f - g\| = \max_i |f_i - g_i|$. 
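The definition translates directly into code. The following is a minimal sketch, assuming a probabilistic function is stored as a dictionary mapping each input to a dictionary of output probabilities. This representation and the names `tvd` and `distance` are illustrative. Note that, as in the definition, the distance is the plain $\ell_1$ sum, without a factor of one half.

```python
def tvd(p, q):
    """l1 distance between two distributions given as outcome -> probability."""
    outcomes = set(p) | set(q)
    return sum(abs(p.get(j, 0.0) - q.get(j, 0.0)) for j in outcomes)

def distance(f, g):
    """||f - g||: the worst-case t.v.d. over all inputs i."""
    return max(tvd(f[i], g[i]) for i in f)

# Two hypothetical probabilistic functions on inputs 0 and 1.
f = {0: {"a": 0.5, "b": 0.5}, 1: {"a": 1.0}}
g = {0: {"a": 0.25, "b": 0.75}, 1: {"a": 0.875, "b": 0.125}}

print(tvd(f[0], g[0]))  # 0.5
print(tvd(f[1], g[1]))  # 0.25
print(distance(f, g))   # 0.5
```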



5.2 The Trace Metric on Density Matrices 

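Precision of a vector $|\xi\rangle \in \mathcal{N}$ (where $\mathcal{N}$ is a Hilbert space) is characterized by the natural 
(Euclidean) norm $\| |\xi\rangle \| = \sqrt{\langle\xi|\xi\rangle}$. Since we have passed to density matrices, we need a metric 
on general quantum states. There are two natural norms on the space of linear operators on 
$\mathcal{N}$: the usual operator norm, 

$$\|A\| = \sup_{|\xi\rangle \neq 0} \frac{\| A|\xi\rangle \|}{\| |\xi\rangle \|} = \text{largest eigenvalue of } \sqrt{A^\dagger A} \qquad (2)$$

and the dual norm called the trace norm, 

$$\|A\|_1 = \sup_{B \neq 0} \frac{|\mathrm{Tr}\, AB|}{\|B\|} = \mathrm{Tr} \sqrt{A^\dagger A} \qquad (3)$$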


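lemma 10 The norms $\|\cdot\|$ and $\|\cdot\|_1$ are well behaved. Specifically, if $A \in L(\mathcal{N})$ and $B \in L(\mathcal{M})$ then 

$$\|A \otimes B\| = \|A\| \, \|B\|,$$
$$\|A \otimes B\|_1 = \|A\|_1 \, \|B\|_1,$$
$$\|AB\| \le \|A\| \, \|B\|,$$
$$\|AB\|_1, \ \|BA\|_1 \le \|B\| \, \|A\|_1,$$
$$|\mathrm{Tr}\, A| \le \|A\|_1 \qquad (4)$$

(Proof is trivial). 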

There are many good reasons to use the trace norm as the norm on density matrices 
(though the operator norm will be very useful in proofs.) First, two pure states \^), \r]) which 
are close in the Euclidean norm are close also in the trace norm: 



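$$\big\| |\xi\rangle\langle\xi| - |\eta\rangle\langle\eta| \big\|_1 = 2\sqrt{1 - |\langle\xi|\eta\rangle|^2} \le 2 \, \big\| |\xi\rangle - |\eta\rangle \big\|$$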
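The inequality $\| |\xi\rangle\langle\xi| - |\eta\rangle\langle\eta| \|_1 \le 2 \| |\xi\rangle - |\eta\rangle \|$ can be checked on a single qubit. The following is a minimal sketch, assuming two-component state vectors. The trace norm of a $2 \times 2$ Hermitian matrix is computed from its two eigenvalues in closed form, and the function names are illustrative.

```python
from math import sqrt, cos, sin

def outer(u, v):
    """|u><v| for 2-component complex vectors."""
    return [[u[r] * v[c].conjugate() for c in range(2)] for r in range(2)]

def trace_norm_hermitian_2x2(m):
    """Sum of absolute eigenvalues of a 2x2 Hermitian matrix."""
    a, d = m[0][0].real, m[1][1].real
    b = m[0][1]
    mean = (a + d) / 2
    radius = sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return abs(mean + radius) + abs(mean - radius)

def pure_state_gap(xi, eta):
    """Return (trace distance, closed form, Euclidean bound) for unit vectors."""
    px, pe = outer(xi, xi), outer(eta, eta)
    diff = [[px[r][c] - pe[r][c] for c in range(2)] for r in range(2)]
    overlap = sum(x.conjugate() * e for x, e in zip(xi, eta))
    trace_dist = trace_norm_hermitian_2x2(diff)
    closed_form = 2 * sqrt(max(0.0, 1 - abs(overlap) ** 2))
    euclid_bound = 2 * sqrt(sum(abs(x - e) ** 2 for x, e in zip(xi, eta)))
    return trace_dist, closed_form, euclid_bound

# Illustrative states: |0> and a state rotated away from it by angle 0.3.
xi = [complex(1), complex(0)]
eta = [complex(cos(0.3)), complex(sin(0.3))]
trace_dist, closed_form, euclid_bound = pure_state_gap(xi, eta)
print(trace_dist, closed_form, euclid_bound)
```

For these two states the first two values agree, and both are slightly smaller than the third, as the inequality requires.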



The important feature of the trace metric is that it captures the measurable distance between 
different density matrices. It turns out that the trace distance between two density matrices 
equals the following quantity. For each observable $O$, a density matrix $\rho$ induces a probability 
distribution, $\rho^O$, over the measurement outcomes $i$. The trace distance between two density matrices is the maximal t.v.d. 
between the two probability distributions, taken over all possible observables. 

lemma 11 $\|\rho_1 - \rho_2\|_1 = \max_O |\rho_1^O - \rho_2^O|$ 



Proof: Let $\mathcal{N} = \bigoplus_j S_j$, where the subspaces $S_j$ are mutually orthogonal. Let $P_j$ be the 
orthogonal projection onto $S_j$. Then, for any pair of mixed states $\rho_1$ and $\rho_2$, 
$\sum_j |\mathrm{Tr}(P_j \rho_1) - \mathrm{Tr}(P_j \rho_2)| \le \|\rho_1 - \rho_2\|_1$. To see this, present the left hand side of this inequality as 
$\mathrm{Tr}((\rho_1 - \rho_2)B)$, where $B = \sum_j \pm P_j$. It is obvious that $\|B\| = 1$. Then use lemma 10. To see that 
the trace distance can be achieved by some measurement, let $O$ project on the eigenvectors of 
$\rho_1 - \rho_2$. ∎



5.3 The Diamond Metric on Quantum Gates 

The natural norm on the space of super-operators is 

$$\|T\|_1 = \sup_{X \neq 0} \frac{\|TX\|_1}{\|X\|_1}$$

Unfortunately, this norm is not stable with respect to tensoring with the identity. Counterexample: 
$T : |i\rangle\langle j| \mapsto |j\rangle\langle i|$ ($i, j = 0, 1$). It is clear that $\|T\|_1 \le 1$. However $\|T \otimes I_{\mathcal{B}}\|_1 \ge 2$. 
(Apply the super-operator $T \otimes I_{\mathcal{B}}$ to the operator $X = \sum_{i,j} |i,i\rangle\langle j,j|$.) For this reason, we 
have to define another norm on super-operators. 
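The counterexample can be verified by hand or with a few lines of code. The following is a minimal sketch, assuming two qubits with basis index `2*first + second`. It avoids a general eigenvalue solver by using the structure of the two matrices. $X$ is symmetric with $X^2 = 2X$, so it is positive semidefinite and its trace norm equals its trace. The image $Y = (T \otimes I)X$ is symmetric with $Y^2 = I$, so all of its singular values are 1. The function names are illustrative.

```python
def partial_transpose_first(x):
    """Apply T (x) I to a 4x4 matrix, where T transposes the first qubit."""
    y = [[0] * 4 for _ in range(4)]
    for a in range(2):
        for b in range(2):
            for c in range(2):
                for d in range(2):
                    # entry <a,b| Y |c,d> equals entry <c,b| X |a,d>
                    y[2 * a + b][2 * c + d] = x[2 * c + b][2 * a + d]
    return y

def matmul(p, q):
    """Product of two square matrices given as nested lists."""
    n = len(p)
    return [[sum(p[r][k] * q[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

# X = sum_{i,j} |i,i><j,j|; the state |i,i> has index 3*i.
x = [[0] * 4 for _ in range(4)]
for i in range(2):
    for j in range(2):
        x[3 * i][3 * j] = 1

y = partial_transpose_first(x)
identity = [[int(r == c) for c in range(4)] for r in range(4)]
twice_x = [[2 * v for v in row] for row in x]

# Structural checks that justify the closed-form trace norms below.
assert matmul(x, x) == twice_x   # X/2 is a projector, so X is positive semidefinite
assert matmul(y, y) == identity  # Y is the swap operator, all singular values are 1

norm_x = sum(x[r][r] for r in range(4))  # trace norm of X = Tr X
norm_y = len(y)                          # trace norm of Y = number of singular values
print(norm_y / norm_x)  # 2.0
```

The ratio 2 exceeds the bound $\|T\|_1 \le 1$ that holds for $T$ alone, which is the instability the text describes.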

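definition 8 Let $T : L(\mathcal{N}) \to L(\mathcal{M})$ and $A, B \in L(\mathcal{N}, \mathcal{M} \otimes \mathcal{F})$, where $\mathcal{F}$ is an arbitrary 
Hilbert space of dimensionality $\ge (\dim \mathcal{N})(\dim \mathcal{M})$. 

$$\|T\|_\diamond = \inf\{ \|A\| \, \|B\| : \mathrm{Tr}_{\mathcal{F}}(A \cdot B^\dagger) = T \}$$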



This definition seems very complicated. However, it is worthwhile using this norm because it 
satisfies very nice properties, and provides powerful tools for proofs regarding quantum errors. 
Here are some properties which are satisfied by the diamond norm. The first property is that 
the diamond norm is the stabilized version of the "naive" norm $\|\cdot\|_1$. The proof of this is 
complicated and non-trivial. It implies also that $\|\cdot\|_\diamond$ is a norm. 

lemma 12 

1. $\|T\|_\diamond = \|T \otimes I_{\mathcal{G}}\|_1 \ge \|T\|_1$, where $\dim \mathcal{G} \ge \dim \mathcal{N}$. 

2. $\|T\rho\|_1 \le \|T\|_\diamond \, \|\rho\|_1$ 

3. $\|TR\|_\diamond \le \|T\|_\diamond \, \|R\|_\diamond$ 

4. $\|T \otimes R\|_\diamond = \|T\|_\diamond \, \|R\|_\diamond$ 

5. The norm of any physically allowed super-operator $T$ is equal to 1. 

6. If $\|V\| \le 1$ and $\|W\| \le 1$ then $\|V \cdot V^\dagger - W \cdot W^\dagger\|_\diamond \le 2\|V - W\|$. 



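Proof of 12.1: It is easy to see that $\|T \otimes I_{\mathcal{G}}\|_1 \le \|T \otimes I_{\mathcal{G}}\|_\diamond \le \|T\|_\diamond$. The inequality 
$\|T\|_\diamond \le \|T \otimes I_{\mathcal{G}}\|_1$ is not so obvious. W.l.o.g. $\|T\|_\diamond = 1$. We are to prove that $\|T \otimes I_{\mathcal{G}}\|_1 \ge 1$. 

We will use the following notation. $D(\mathcal{K})$ denotes the set of density matrices on $\mathcal{K}$, whereas 
$H(\mathcal{F})$ is the set of Hermitian operators. We can impose the restriction $\|A\| = \|B\| \le 2$ without 
changing the infimum in definition 8. Due to compactness, the infimum is achieved at 
some $A$ and $B$. W.l.o.g. $\|A\| = \|B\| = 1$. The quantity $\|A\| \, \|B\|$ is minimal with respect 
to infinitesimal variations of the scalar product $\delta\langle\cdot|\cdot\rangle = \langle\cdot|Z|\cdot\rangle$ on the space $\mathcal{F}$. (Here $Z$ is an 
infinitely small Hermitian operator on $\mathcal{F}$.) When computing the variations $\delta\|A\|$ and $\delta\|B\|$, we 
can restrict $A$ and $B$ to the subspaces $\mathcal{K} = \mathrm{Ker}(A^\dagger A - 1_{\mathcal{N}})$ and $\mathcal{L} = \mathrm{Ker}(B^\dagger B - 1_{\mathcal{N}})$. Clearly, 

$$\delta\|A\| = \max_{|\xi\rangle \in \mathcal{K}, \ \| |\xi\rangle \| = 1} \langle\xi| A^\dagger (1_{\mathcal{M}} \otimes Z) A |\xi\rangle = \max_{X \in E} \mathrm{Tr}(XZ)$$

$$\delta\|B\| = \max_{|\eta\rangle \in \mathcal{L}, \ \| |\eta\rangle \| = 1} -\langle\eta| B^\dagger (1_{\mathcal{M}} \otimes Z) B |\eta\rangle = \max_{Y \in F} -\mathrm{Tr}(YZ)$$

where 

$$E = \{ \mathrm{Tr}_{\mathcal{M}}(A \rho A^\dagger) : \rho \in D(\mathcal{K}) \}$$
$$F = \{ \mathrm{Tr}_{\mathcal{M}}(B \gamma B^\dagger) : \gamma \in D(\mathcal{L}) \}$$

Thus, for any $Z \in H(\mathcal{F})$, 

$$\delta(\|A\| \, \|B\|) = \max_{X \in E, \ Y \in F} \big( \mathrm{Tr}(XZ) - \mathrm{Tr}(YZ) \big) \ge 0$$

This means that the sets $E, F \subseteq H(\mathcal{F})$ cannot be separated by a hyperplane. As $E$ and $F$ are 
convex and compact, $E \cap F \neq \emptyset$. Let $\mathrm{Tr}_{\mathcal{M}}(A \rho A^\dagger) = \mathrm{Tr}_{\mathcal{M}}(B \gamma B^\dagger) \in E \cap F$, where $\rho \in D(\mathcal{K})$, 
$\gamma \in D(\mathcal{L})$. Let us represent $\rho$ and $\gamma$ in the form $\rho = \mathrm{Tr}_{\mathcal{G}}(|\xi\rangle\langle\xi|)$, $\gamma = \mathrm{Tr}_{\mathcal{G}}(|\eta\rangle\langle\eta|)$, where 
$|\xi\rangle, |\eta\rangle \in \mathcal{N} \otimes \mathcal{G}$ are unit vectors. Put $X = |\xi\rangle\langle\eta|$. Then $\|(T \otimes I_{\mathcal{G}})X\|_1 = \|X\|_1 = 1$. 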



Proof of 12.2, 12.3: These follow from the relation to the norm $\|\cdot\|_1$ and the definition of the latter. 



Proof of 12.4: The inequality $\|T \otimes R\|_\diamond \le \|T\|_\diamond \, \|R\|_\diamond$ follows from definition 8, 
whereas the inverse inequality follows from 12.1. 

Proof of 12.5: W.l.o.g. $\|T\|_\diamond = \|T\|_1$ (since we can tensor $T$ with the identity). Let 
$T = \mathrm{Tr}_{\mathcal{F}}(V \cdot V^\dagger) : L(\mathcal{N}) \to L(\mathcal{M})$. Let us define the dual super-operator $R : L(\mathcal{M}) \to L(\mathcal{N})$ 
with the property: $\mathrm{Tr}(Y(TX)) = \mathrm{Tr}((RY)X)$ for every $X \in L(\mathcal{N})$ and $Y \in L(\mathcal{M})$. It is 
obvious that $RY = V^\dagger (Y \otimes 1_{\mathcal{F}}) V$ and 

$$\|T\|_1 = \sup_{Y \neq 0} \frac{\|RY\|}{\|Y\|}$$

As $\|V\| \le 1$, the inequality $\|T\|_1 \le 1$ follows immediately. To prove the inverse inequality, take 
the identity operator for $Y$. 



Proof of 12.6: To prove this we need lemma 13 from the next section. (For completeness this 
property appears here.) In the definitions of the lemma put $T_1 = V \cdot I$, $T_1' = W \cdot I$, $T_2 = I \cdot V^\dagger$, 
$T_2' = I \cdot W^\dagger$. ∎

The distance between unitary super-operators $V \cdot V^\dagger$, $W \cdot W^\dagger$ can be calculated explicitly, 
and has a geometrical interpretation. Denote by $d$ the distance between 0 and the polygon (in 
the complex plane) whose vertices are the eigenvalues of $VW^\dagger$. Then 

$$\|V \cdot V^\dagger - W \cdot W^\dagger\|_\diamond = \max_\rho \|V \rho V^\dagger - W \rho W^\dagger\|_1 = 2\sqrt{1 - d^2}$$

(The proof is left to the reader). 



5.4 Bounding the Overall Error 

By definition, the error of a quantum gate is measured by the $\diamond$-norm. The accumulation of 
errors is bounded by the following lemma: 

lemma 13 Let $T_1$, $T_2$ and $T_1'$, $T_2'$ be super-operators with norm $\le 1$, such that $\|T_j' - T_j\|_\diamond \le \epsilon_j$ 
($j = 1, 2$). Then $\|T_2' T_1' - T_2 T_1\|_\diamond \le \epsilon_1 + \epsilon_2$. 

Proof: Write $T_2' T_1' - T_2 T_1 = T_2'(T_1' - T_1) + (T_2' - T_2) T_1$, and use lemma 12.3. ∎
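The additive accumulation of errors has a simple classical analogue that is easy to test. The following sketch is not the diamond norm itself: it assumes column-stochastic matrices acting on probability vectors, with the induced $\ell_1$ norm (largest absolute column sum) playing the role of the $\diamond$-norm. Stochastic matrices have norm 1 in this sense, and the same telescoping argument applies. All matrices and names below are hypothetical examples.

```python
def matmul(p, q):
    """Product of two square matrices given as nested lists."""
    n = len(p)
    return [[sum(p[r][k] * q[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def subtract(p, q):
    """Entrywise difference p - q."""
    n = len(p)
    return [[p[r][c] - q[r][c] for c in range(n)] for r in range(n)]

def l1_operator_norm(m):
    """Induced l1 norm of a matrix: the largest absolute column sum."""
    n = len(m)
    return max(sum(abs(m[r][c]) for r in range(n)) for c in range(n))

# Ideal steps t1, t2 and their perturbed versions t1p, t2p (columns sum to 1).
t1 = [[0.5, 0.25], [0.5, 0.75]]
t1p = [[0.5, 0.375], [0.5, 0.625]]
t2 = [[1.0, 0.5], [0.0, 0.5]]
t2p = [[0.875, 0.5], [0.125, 0.5]]

eps1 = l1_operator_norm(subtract(t1p, t1))
eps2 = l1_operator_norm(subtract(t2p, t2))

# Applying t1 first and then t2 corresponds to the matrix product t2 * t1.
total = l1_operator_norm(subtract(matmul(t2p, t1p), matmul(t2, t1)))

print(eps1, eps2, total)  # 0.25 0.25 0.125
```

The error of the composed map stays below the sum of the individual errors, in line with the lemma.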

For subroutines, the natural error measure is different. Fortunately, there is a linear upper 
bound for the $\diamond$-norm error of a subroutine: 

lemma 14 Let $f$ and $f'$ be two probabilistic subroutines, such that $\|f - f'\| \le \epsilon$. Then 
$\|g_f - g_{f'}\|_\diamond \le 5\epsilon$. (The super-operator $g_f$ was described in an earlier lemma.) 

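Proof: Let $\mathcal{N}$ be the space of the inputs and $\mathcal{M}$ the space of the outputs. Define the 
following objects (and their primed versions): 

$$A : \mathcal{N} \to \mathcal{M} : |i\rangle \mapsto \sum_j f_{ij} |i,j\rangle$$

$$B : \mathcal{N} \to \mathcal{M} \otimes \mathcal{N} : |i\rangle \mapsto \sum_j f_{ij} |i,j,i\rangle$$

$$\rho_i = \sum_j f_{ij} |i,j\rangle\langle i,j|$$

$$P_i : L(\mathcal{N}) \to \mathbb{C} : |i_1\rangle\langle i_2| \mapsto \delta_{i i_1} \delta_{i i_2}$$

Then $g_f = A \cdot A^\dagger - \mathrm{Tr}_{\mathcal{N}}(B \cdot B^\dagger) + \sum_i \rho_i P_i$ (the same for $g_{f'}$). Clearly, $\|A' - A\| \le \epsilon$, $\|B' - B\| \le \epsilon$, 
and the norm of each operator $A$, $B$, $A'$, $B'$ does not exceed 1. By lemma 12.6, each of the first two terms contributes at most $2\epsilon$ to the error. It remains to show that $\|T\|_\diamond \le \epsilon$, 
where $T = \sum_i (\rho_i' - \rho_i) P_i$. 

Note that $\|\rho_i' - \rho_i\|_1 \le \epsilon$ for each $i$. Hence $\|(T \otimes I_{\mathcal{G}})X\|_1 \le \epsilon \sum_i \|X_i\|_1$, where $\mathcal{G}$ is an 
arbitrary Hilbert space, $X \in L(\mathcal{N} \otimes \mathcal{G})$, and $X_i = (P_i \otimes I_{\mathcal{G}})X$. On the other hand, 

$$\sum_i \|X_i\|_1 = \|(P \otimes I_{\mathcal{G}})X\|_1 \le \|P\|_\diamond \, \|X\|_1 = \|X\|_1$$

where $P : |i_1\rangle\langle i_2| \mapsto \delta_{i_1 i_2} |i_1\rangle\langle i_2|$ is a physically realizable super-operator. ∎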



Due to lemmas 13 and 11, an $\epsilon$ error generated somewhere in the circuit cannot contribute 
more than $\epsilon$ error to the computed function. This proves the following theorem: 



Theorem 4 Let $Q$ be a quantum circuit which uses $L$ probabilistic subroutines and gates, each 
with at most $\epsilon$ error. The function that $Q$ computes has at most $O(L\epsilon)$ error. 